Methods for incorporating the hypermutability of CpG dinucleotides in detecting natural selection operating at the amino acid sequence level.
Abstract
In detecting natural selection operating at the amino acid sequence level by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) substitutions, the rates of synonymous and nonsynonymous mutations are assumed to be approximately the same. In reality, however, these rates may not be the same if different proportions of synonymous and nonsynonymous sites overlap with CpG dinucleotides, which are known to be hypermutable in some organisms. Here, we develop evolutionary pathway methods for comparing r(S) and r(N) at multiple codon sites (all-sites analysis) and at single codon sites (single-site analysis) that take into account the hypermutability at CpG dinucleotides in estimating the number of synonymous substitutions per synonymous site (d(S)) and nonsynonymous substitutions per nonsynonymous site (d(N)). Computer simulations show that the direction and magnitude of the bias in the estimation of d(N)/d(S) caused by the hypermutability of CpGs are determined by both the number of CpGs and the relative proportions of synonymous and nonsynonymous sites overlapping with CpGs. This bias is greatly reduced when using the methods we propose to account for the hypermutability of CpG dinucleotides. In an all-sites analysis of protamine 1 genes from primates, d(N)/d(S) > 1 was observed for many pairs if the hypermutability was ignored. However, d(N)/d(S) becomes ≤ 1 for most of these pairs when the CpG sites are assumed to be hypermutable. Therefore, statistical indications of positive selection in some sequences or individual codons may be caused by mutation rate differences in synonymous and nonsynonymous sites.
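The quantity the abstract identifies as driving the bias, the relative proportions of synonymous and nonsynonymous sites that overlap with CpGs, can be illustrated with a minimal Python sketch. This is not the evolutionary pathway method of the paper. It assumes the standard genetic code, a simple site-counting rule in which each nucleotide position contributes the fraction of its three possible single-base changes that are synonymous, and that changes to or from a stop codon count as nonsynonymous. The function names `cpg_positions`, `synonymous_fraction`, and `cpg_site_overlap` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def cpg_positions(seq):
    """Return the 0-based indices of all nucleotides that lie inside a CG dinucleotide."""
    hits = set()
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "CG":
            hits.update((i, i + 1))
    return hits


def synonymous_fraction(codon, pos):
    """Fraction of the three single-base changes at `pos` that keep the amino acid.

    Assumption: a change to or from a stop codon is treated as nonsynonymous.
    """
    original = CODON_TABLE[codon]
    synonymous = 0
    for base in BASES:
        if base == codon[pos]:
            continue
        mutant = codon[:pos] + base + codon[pos + 1:]
        if CODON_TABLE[mutant] == original:
            synonymous += 1
    return synonymous / 3


def cpg_site_overlap(seq):
    """Count synonymous (S) and nonsynonymous (N) sites, in total and inside CpGs."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    cpg = cpg_positions(seq)
    totals = {"S": 0.0, "N": 0.0, "S_cpg": 0.0, "N_cpg": 0.0}
    for start in range(0, len(seq), 3):
        codon = seq[start:start + 3]
        for pos in range(3):
            s = synonymous_fraction(codon, pos)
            n = 1.0 - s
            totals["S"] += s
            totals["N"] += n
            if start + pos in cpg:
                totals["S_cpg"] += s
                totals["N_cpg"] += n
    return totals


if __name__ == "__main__":
    counts = cpg_site_overlap("ACGTTT")
    print("share of synonymous sites in CpGs:", counts["S_cpg"] / counts["S"])
    print("share of nonsynonymous sites in CpGs:", counts["N_cpg"] / counts["N"])
```

When the two shares differ, as they can for CpG-rich genes, an elevated mutation rate at CpGs raises the synonymous and nonsynonymous mutation rates by different amounts, which is the situation in which an uncorrected d(N)/d(S) may be biased.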
Related articles
Exploiting CpG Hypermutability to Identify Phenotypically Significant Variation Within Human Protein-Coding Genes
The CpG dinucleotide is disproportionately represented in human genetic variation due to the hypermutability of 5-methyl-cytosine (5mC). We exploit this hypermutability and a novel codon substitution model to identify candidate functionally important exonic nucleotides. Population genetic theory suggests that codon positions with high cross-species CpG frequency will derive from stronger purify...
Review of "clustering for data mining: a data recovery approach" by Boris Mirkin
Background. Nullomers are short DNA sequences that are absent from the genomes of humans and other species. Assuming that nullomers are the signatures of natural selection against deleterious sequences in humans, the use of nullomers in drug target identification, pesticide development, environmental monitoring, and forensic applications has been envisioned. Results. Here, we show that the hype...
Nullomers: Really a Matter of Natural Selection?
BACKGROUND Nullomers are short DNA sequences that are absent from the genomes of humans and other species. Assuming that nullomers are the signatures of natural selection against deleterious sequences in humans, the use of nullomers in drug target identification, pesticide development, environmental monitoring, and forensic applications has been envisioned. RESULTS Here, we show that the hype...
Higher intensity of purifying selection on >90% of the human genes revealed by the intrinsic replacement mutation rates.
For over 3 decades, the rate of replacement mutations has been assumed to be equal to, and estimated from, the rate of "strictly" neutral sequence divergence in noncoding regions and in silent-codon positions where mutations do not alter the amino acid encoded. This assumption is fundamental to estimating the fraction of harmful protein mutations and to identifying adaptive evolution at individ...
Phylogenetic Window Analysis for Detecting Chronological Changes in Natural Selection
Natural selection operating at the amino acid sequence level can be detected by comparing the rates of synonymous (rS) and nonsynonymous (rN) substitutions for the protein-coding nucleotide sequence, where relationships rN > rS and rN < rS conventionally indicate positive and negative selection, respectively. The direction and magnitude of natural selection operating on a protein may change dur...
Journal:
- Molecular Biology and Evolution
Volume 26, Issue 10
Pages: -
Publication date: 2009